ABSTRACT



Data deduplication is a technique for eliminating duplicate copies of data in order to make better use of storage space and network
bandwidth. Only one copy of each file is kept in the cloud, even though any number of users may own that same file. The data that
users outsource to the cloud is often sensitive and must be protected against leakage. In this paper we introduce a third-party
auditor (TPA) combined with a secure distributed storage system to provide data integrity and tag consistency. The TPA is a public
verifier that checks that the data stored by the user in the cloud has not been modified or corrupted.
Introduction:

There are two types of deduplication: 

i. File-level deduplication, which detects redundancy between
different files and eliminates the duplicate copies.

ii. Block-level deduplication, in which a file is split into blocks
of fixed or variable size and deduplication is then performed on
the blocks to detect identical content in the files.
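
The two levels can be combined in a single content-addressed store.
The following is a minimal sketch, not the system's actual
implementation: the class name DedupStore, the use of SHA-256 as the
fingerprint, and the fixed block size are all assumptions made for
illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Content fingerprint; SHA-256 is an illustrative choice."""
    return hashlib.sha256(data).hexdigest()


class DedupStore:
    """Hypothetical store that deduplicates at file and block level."""

    def __init__(self, block_size: int = 4):
        self.block_size = block_size
        self.files = {}   # file fingerprint -> ordered list of block fingerprints
        self.blocks = {}  # block fingerprint -> block bytes (stored once)

    def put(self, data: bytes) -> str:
        file_id = fingerprint(data)
        if file_id in self.files:
            # File-level deduplication: identical file already stored.
            return file_id
        recipe = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            block_id = fingerprint(block)
            # Block-level deduplication: keep only the first copy of a block.
            self.blocks.setdefault(block_id, block)
            recipe.append(block_id)
        self.files[file_id] = recipe
        return file_id

    def get(self, file_id: str) -> bytes:
        """Reassemble a file from its block recipe."""
        return b"".join(self.blocks[b] for b in self.files[file_id])
```

Uploading the same file twice adds nothing to the store, and two
files that differ in one block share all of their other blocks.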

The ramp secret sharing scheme is used to divide a file into N
shares and distribute them across the servers. The Share and
Recover algorithms are used to split the data and to reconstruct
it from the distributed servers. With this method the data can be
recovered if part of it is lost or corrupted in the cloud,
transparently to the user.
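
A Share/Recover pair of this kind can be sketched with polynomial
interpolation over a prime field. This is one common way to build a
ramp scheme and is shown only as an illustration under assumptions:
the modulus P, the function names, and the parameters n (number of
shares), k (shares needed to recover) and L (number of secret pieces
packed into one polynomial) are not taken from the paper.

```python
import secrets

P = 2**61 - 1  # prime field modulus (illustrative choice)


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial through `points`, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def share(secret, n, k):
    """Split `secret` (a list of L <= k field elements) into n shares.

    The polynomial of degree < k is fixed by k defining points:
    the L secret pieces at x = n+1 .. n+L and k-L random values
    at x = n+L+1 .. n+k. Shares are its values at x = 1 .. n.
    """
    L = len(secret)
    anchors = [(n + 1 + i, s % P) for i, s in enumerate(secret)]
    anchors += [(n + 1 + i, secrets.randbelow(P)) for i in range(L, k)]
    return [(x, lagrange_eval(anchors, x)) for x in range(1, n + 1)]


def recover(shares, n, L):
    """Rebuild the L secret pieces from any k (or more) shares."""
    return [lagrange_eval(shares, n + 1 + i) for i in range(L)]
```

With n = 5 and k = 4, for example, any four of the five servers are
enough to rebuild the secret, so the loss or corruption of one share
can be tolerated.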

The TPA is used to maintain the integrity of outsourced data. It is a
public verifier that acts as an intermediary between the user and
the cloud.

It works in three steps: 

i. Challenge 

ii. Proof 

iii. Verification 



Materials and Methods: 




DISTRIBUTED SERVER 



The system model involves three parties: the cloud server, a group
of users, and a public verifier.

A public verifier such as the TPA provides expert data auditing
services to publicly verify the integrity of shared data stored in
the cloud server.

When a public verifier wishes to check the integrity of shared data,
it first sends an auditing challenge to the cloud server. The cloud
server responds to the public verifier with an auditing proof of
possession of the shared data.

Essentially the process of public auditing is a challenge and 
response protocol between a public verifier and the cloud server. 
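
The challenge, proof and verification steps can be illustrated with
a deliberately simple token-based audit. This is a sketch under
stated assumptions, not the auditing scheme of the paper: the data
owner precomputes a limited number of (nonce, digest) tokens and
hands them to the verifier, so the verifier never sees the data
itself. The function names and the use of SHA-256 are hypothetical
choices, and practical PDP/PoR schemes avoid both the bounded number
of audits and the need for the server to read the whole file.

```python
import hashlib
import hmac
import secrets


def make_tokens(data: bytes, count: int):
    """Owner side: precompute (nonce, expected digest) pairs before upload."""
    tokens = []
    for _ in range(count):
        nonce = secrets.token_bytes(16)
        tokens.append((nonce, hashlib.sha256(nonce + data).hexdigest()))
    return tokens


def prove(stored: bytes, nonce: bytes) -> str:
    """Cloud side: answer a challenge from the data it actually holds."""
    return hashlib.sha256(nonce + stored).hexdigest()


def audit(tokens, cloud_prove) -> bool:
    """Verifier side: spend one token to run challenge, proof, verification."""
    nonce, expected = tokens.pop()         # challenge (each nonce used once)
    proof = cloud_prove(nonce)             # proof returned by the cloud
    return hmac.compare_digest(proof, expected)  # verification
```

Because every nonce is fresh, the cloud cannot answer from a cached
digest; it has to hold the unmodified data at the time of the audit.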

The auditing schemes used by the TPA fall into two categories:

i. Proof of Data Possession (PDP)

ii. Proofs of Retrievability (PoR)

PDP schemes are related protocols that detect only large amounts
of corruption in outsourced data [1, 2].

A PoR scheme [3], in contrast, is a challenge-response protocol that
enables a cloud provider to demonstrate to a client that a file is
retrievable, i.e., recoverable without any loss or corruption. Such
schemes use spot-checking and error-correcting codes to ensure
both "possession" and "retrievability" of remote data files.
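
Why spot-checking alone catches only large-scale corruption can be
seen from a short calculation. If a file has n blocks of which t are
corrupted, and the verifier samples c distinct blocks uniformly at
random, the probability of hitting at least one corrupted block is
1 - C(n-t, c) / C(n, c). The helper below is an illustrative sketch;
its name and parameter names are assumptions.

```python
from math import comb


def detection_probability(n_blocks: int, corrupted: int, sampled: int) -> float:
    """Chance that a random sample of `sampled` distinct blocks
    contains at least one of the `corrupted` blocks."""
    missed = comb(n_blocks - corrupted, sampled) / comb(n_blocks, sampled)
    return 1 - missed
```

For instance, with 10,000 blocks and a sample of 460, corruption of
1% of the blocks is detected with probability above 0.99, whereas a
single corrupted block is detected with probability of only about
0.046. Small corruptions that escape sampling are what the
error-correcting code is meant to repair.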

The TPA sends a challenge to the cloud, and in response the cloud
sends a proof to the TPA. After receiving the proof, the TPA performs
the last step, verification, in which it checks that the data stored
by the user has not been corrupted or changed.



Architecture 
Flow chart:



Modules used: 

i. File deduplication 

ii. Block deduplication 

iii. TPA 

iv. Distributed storage server 

Results: 

The distributed deduplication system is used to improve the
reliability of data. Our model supports file-level and block-level
data deduplication. The deduplication system uses the ramp secret
sharing scheme, which has been demonstrated to incur small
encoding/decoding overhead compared to the network transmission
overhead of regular upload/download operations.

The public auditing scheme eases users' fear of leakage of their
outsourced data.

The TPA can perform multiple auditing tasks in a batch for
better efficiency.

The schemes used are secure and highly efficient.

An adversary cannot deduce any information about the stored file
from the auditing interaction between the cloud server (CS) and
the TPA.

Discussion: 

Earlier systems verified the data by downloading the entire file
from the cloud, which was costly and time-consuming. Moreover,
public verifiers could themselves be a source of data leakage; the
TPA is used to overcome this problem. The TPA does not gain any
knowledge of the data contents stored on the cloud server during
the auditing process. The TPA can also concurrently handle multiple
audit sessions from different users for their outsourced data.

Data security and privacy have always been a concern, so they can
be enhanced further in future work. To make the system reliable, the
concept of multiple servers has been introduced; however, the same
file is then present at multiple locations, which uses unnecessary
storage space. This problem can be addressed in future work.